Cepstrum-based filter-bank design using discriminative feature extraction training at various levels
Authors
Abstract
This paper investigates the realization of optimal filter bank-based cepstral parameters. The framework is the Discriminative Feature Extraction (DFE) method, which iteratively estimates the filter-bank parameters according to the errors that the system makes. Various parameters of the filter bank, such as center frequency, bandwidth, and gain, are optimized using a string-level and a frame-level optimization scheme. Application to vowel and noisy telephone speech recognition tasks shows that the DFE method realizes a more robust classifier through appropriate feature extraction.
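As a rough illustration of the kind of front end described above, the sketch below computes cepstral coefficients from a filter bank whose center frequencies, bandwidths, and gains are explicit parameters. The Gaussian filter shape, the function names, and the toy values are assumptions made for illustration and are not taken from the paper; the DFE training loop itself (updating these parameters from recognition errors) is not shown.

```python
import numpy as np

def gaussian_filter_bank(freqs, centers, bandwidths, gains):
    """Return an (n_filters, n_bins) matrix of filter weights.

    Assumption: each filter is Gaussian-shaped, so it is fully described
    by a center frequency, a bandwidth, and a gain.
    """
    # Broadcasting: each row is one filter evaluated over all frequency bins.
    diff = freqs[None, :] - centers[:, None]
    return gains[:, None] * np.exp(-0.5 * (diff / bandwidths[:, None]) ** 2)

def dct_matrix(n_cepstra, n_filters):
    """Unnormalized DCT-II basis mapping log energies to cepstral coefficients."""
    k = np.arange(n_cepstra)[:, None]
    n = np.arange(n_filters)[None, :]
    return np.cos(np.pi * k * (n + 0.5) / n_filters)

def filter_bank_cepstrum(power_spectrum, freqs, centers, bandwidths, gains,
                         n_cepstra=12):
    """Compute cepstral coefficients for one frame's power spectrum."""
    weights = gaussian_filter_bank(freqs, centers, bandwidths, gains)
    energies = weights @ power_spectrum          # one energy per filter
    log_energies = np.log(energies + 1e-10)      # small floor avoids log(0)
    return dct_matrix(n_cepstra, len(centers)) @ log_energies

# Toy usage with hypothetical values (not from the paper).
freqs = np.linspace(0.0, 4000.0, 129)            # frequency of each bin, in Hz
centers = np.linspace(200.0, 3800.0, 16)         # 16 filters
bandwidths = np.full(16, 150.0)
gains = np.ones(16)
frame = np.random.default_rng(0).random(129)     # stand-in power spectrum
cepstrum = filter_bank_cepstrum(frame, freqs, centers, bandwidths, gains)
```

Because every step here is differentiable with respect to `centers`, `bandwidths`, and `gains`, a discriminative scheme could in principle adjust them by gradient descent on a classification loss, which is the general idea the abstract attributes to DFE.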
Similar resources
Fuzzy-based discriminative feature representation for children's speech recognition
Automatic recognition of children's speech is a challenging topic in computer-based speech recognition systems. The conventional feature extraction method, Mel-frequency cepstral coefficients (MFCC), is not efficient for children’s speech recognition. This paper proposes a novel fuzzy-based discriminative feature representation to address the recognition of Malay vowels uttered by children...
Data-Driven Filter-Bank-based Feature Extraction for Speech Recognition
Selecting good features is especially important for achieving high speech recognition accuracy. Although the mel-cepstrum is a popular and effective feature for speech recognition, it is still unclear whether the filter-bank in the mel-cepstrum is always optimal regardless of speech recognition environments or the characteristics of specific speech data. In this paper, we focus on the data-driven filt...
Data Driven Design of Filter Bank for Speech Recognition
The filter bank approach is commonly used in the feature extraction phase of speech recognition (e.g. Mel frequency cepstral coefficients). The filter bank modifies the magnitude spectrum according to physiological and psychological findings. However, since the mechanism of the human auditory system is not fully understood, the optimal filter bank parameters are not known. This work presents a me...
Improving the filter bank of a classic speech feature extraction algorithm
The most popular speech feature extractor used in automatic speech recognition (ASR) systems today is the mel frequency cepstral coefficient (mfcc) algorithm. Introduced in 1980, the filter bank-based algorithm eventually replaced linear prediction cepstral coefficients (lpcc) as the premier front end, primarily because of mfcc’s superior robustness to additive noise. However, mfcc does not app...